Abstract

We enumerate all ternary length-$\ell$ square-free words, which are words avoiding
squares of words up to length $\ell$, for $\ell \le 24$. We analyse the singular behaviour of
the corresponding generating functions. This leads to new upper entropy bounds
for ternary square-free words. We then consider ternary square-free words with
fixed letter densities, thereby proving exponential growth for certain ensembles with
various letter densities. We derive consequences for the free energy and entropy of
ternary square-free words.

1 Introduction 

The interest in the combinatorics of pattern-avoiding words [?], in particular of power-free
words, goes back to work of Axel Thue in the early 20th century [?]. The celebrated
Prouhet-Thue-Morse sequence, defined by a substitution rule $a \to ab$ and $b \to ba$ on a
two-letter alphabet $\{a, b\}$, proves the existence of infinite cube-free words in two letters
$a$ and $b$.

Here, a word of length $n$ is a string of $n$ letters from a certain alphabet $\Sigma$, an element
of the language $\mathcal{L}(n) = \Sigma^n$ of $n$-letter words in $\Sigma$. The union

$$\mathcal{L} = \bigcup_{n \ge 0} \mathcal{L}(n) \tag{1}$$

is the language of all words in the alphabet $\Sigma$. It is a monoid, with concatenation of
words as operation, and with the empty word $\lambda$ of zero length as neutral element [22].
A word $w$ is called square-free if $w = xyyz$, with words $x$, $y$ and $z$, implies that $y = \lambda$
is the empty word, and cube-free words are defined analogously. So square-free words
are characterised by the property that they do not contain an adjacent repetition of any
subword.

It is easy to see that there are only a few square-free words in two letters: these are
the empty word $\lambda$, the two letters $a$ and $b$, the two-letter words $ab$ and $ba$, and, finally,
the three-letter words $aba$ and $bab$. Appending any letter to those two words inevitably
results in a square, either of a single letter, or of one of the square-free two-letter words.

However, there do exist infinite ternary square-free words, i.e., square-free words on
a three-letter alphabet. In fact, the number $s_n$ of ternary square-free words of length $n$
grows exponentially with $n$. Denoting the set of ternary square-free words of length $n$ by
$\mathcal{A}_n$, we have

$$\begin{aligned}
\mathcal{A}_0 &= \{\lambda\}, \\
\mathcal{A}_1 &= \{a, b, c\}, \\
\mathcal{A}_2 &= \{ab, ac, ba, bc, ca, cb\}, \\
\mathcal{A}_3 &= \{aba, abc, aca, acb, bab, bac, bca, bcb, cab, cac, cba, cbc\},
\end{aligned} \tag{2}$$

and so on. So $s_0 = 1$, $s_1 = 3$, $s_2 = 6$, $s_3 = 12$, and so on; see [1] and [12], where the values
of $s_n$ for $n \le 90$ and $91 \le n \le 110$ are tabulated, respectively. In [?], the sequence $s_n$ is
listed as A006156 (formerly M2550).
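The small cases above can be reproduced by brute force. The following is a minimal sketch, not the enumeration method used for the tabulated data; the function names are hypothetical. It builds square-free words letter by letter and relies on the fact that appending a letter to a square-free word can only create a square ending at the new letter.

```python
def has_square_suffix(word):
    """True if some nonempty square yy is a suffix of `word`."""
    n = len(word)
    return any(word[n - 2 * h:n - h] == word[n - h:] for h in range(1, n // 2 + 1))


def square_free_words(alphabet, length):
    """All square-free words of the given length over `alphabet`.

    Every prefix of a square-free word is square-free, so it suffices to
    extend shorter square-free words and test only suffix squares.
    """
    words = [""]
    for _ in range(length):
        words = [w + c for w in words for c in alphabet
                 if not has_square_suffix(w + c)]
    return words


ternary_counts = [len(square_free_words("abc", n)) for n in range(11)]
binary_counts = [len(square_free_words("ab", n)) for n in range(6)]
print(ternary_counts)  # s_0, ..., s_10
print(binary_counts)   # no binary square-free words beyond length 3
```

The exponential growth of the ternary counts makes this approach impractical for lengths anywhere near 110; it is only meant to make the definitions concrete.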

In this article, we consider ternary square-free words [?]. We are interested in the asymptotic growth of the
sequence $s_n$. We use a series of generating functions for a truncated square-freeness
condition and conjecture the presence of a natural boundary at the radius of convergence.
We also consider the frequencies of letters in ternary square-free words and derive upper
and lower bounds. We prove exponential growth for certain ensembles of ternary
square-free words with fixed letter frequencies. We use methods of statistical mechanics [17] to
prove that, subject to a plausible regularity assumption on the free energy of ternary
square-free words, the maximal exponential growth occurs for words with equal mean
letter frequencies, where we average over all square-free words. Some of our results are
based on extensive exact enumerations of square-free ternary words of length $n \le 110$ [12]
and on constructions of generalised Brinkhuis triples [?].



2 Ternary square-free words

Denote the number of ternary square-free words of length $n$ by $s_n$ and the corresponding generating
function by $S(x)$,

$$S(x) = \sum_{n=0}^{\infty} s_n x^n . \tag{3}$$

Since the language of ternary square-free words is subword-closed, we conclude that the
sequence $s_n$ is submultiplicative,

$$s_{n+m} \le s_n \, s_m . \tag{4}$$

A standard argument, compare [?, Lemma 1] and [17, Lemma A.1], shows that this
guarantees that the limit $\mathcal{S} := \lim_{n\to\infty} \frac{1}{n} \log s_n$, also called the entropy, exists. Bounds
for the limit have been obtained in a number of investigations [?],
which give

$$1.1184 \approx 110^{1/42} \le \exp(\mathcal{S}) \le 1.30201064, \tag{5}$$

but the exact value is unknown. The lower bound implies an exponential growth of $s_n$
with $n$. The behaviour of the subleading corrections to the exponential growth is not
understood.
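Submultiplicativity has a practical consequence: by the standard subadditivity argument, the limit equals the infimum of $\frac{1}{n}\log s_n$, so every finite $s_n^{1/n}$ is an upper bound on $\exp(\mathcal{S})$. The sketch below (helper names are hypothetical) checks inequality (4) on small lengths and prints these crude upper bounds next to the lower bound $110^{1/42}$.

```python
def has_square_suffix(word):
    """True if some nonempty square yy is a suffix of `word`."""
    n = len(word)
    return any(word[n - 2 * h:n - h] == word[n - h:] for h in range(1, n // 2 + 1))


def square_free_counts(max_len):
    """Return [s_0, ..., s_max_len] for the ternary alphabet by direct enumeration."""
    counts, words = [1], [""]
    for _ in range(max_len):
        words = [w + c for w in words for c in "abc"
                 if not has_square_suffix(w + c)]
        counts.append(len(words))
    return counts


s = square_free_counts(16)

# Inequality (4): s_{n+m} <= s_n * s_m.
submultiplicative = all(s[n + m] <= s[n] * s[m]
                        for n in range(1, 9) for m in range(1, 9))

# Each s_n^(1/n) bounds exp(S) from above; convergence is slow.
upper_estimates = [s[n] ** (1.0 / n) for n in range(1, 17)]
lower_bound = 110 ** (1.0 / 42)

print(submultiplicative)
print(min(upper_estimates), lower_bound)
```

These finite-length bounds are far weaker than the bounds in (5); they only illustrate why the limit exists.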

One of the authors computed the numbers $s_n$ for $n \le 110$. Assuming an asymptotic
growth of the numbers $s_n$ of the form

$$s_n \sim A \, x_c^{-n} \, n^{\gamma - 1} \qquad (n \to \infty), \tag{6}$$

we used differential approximants of first order to get estimates of the critical point
$x_c = \exp(-\mathcal{S})$, the critical exponent $\gamma$ and the critical amplitude $A$. We obtain

$$A = 12.72(1), \qquad x_c = 0.768189(1), \qquad \gamma = 1.0000(1), \tag{7}$$

where the number in the bracket denotes the (estimated) uncertainty in the last digit.
The value of $\gamma$, also found in [21], suggests a simple pole as dominant singularity of the
generating function at $x = x_c$. Numerical analysis indicates the presence of a natural
boundary, a topic which we considered further by computing approximating generating
functions $S^{(\ell)}(x)$, which count the number of words which contain no squares of words of
length $\le \ell$.



3 Generating functions 

We call a word $w \in \mathcal{L}$ length-$\ell$ square-free if $w = xyyz$, with $x, z \in \mathcal{L}$ and $y \in \bigcup_{n=0}^{\ell} \mathcal{L}(n)$,
implies that $y$ is the empty word $\lambda$. In other words, $w$ does not contain the square of a
word of length $\le \ell$.

Denote the number of ternary length-$\ell$ square-free words of length $n$ by $s_n^{(\ell)}$. Clearly,
$\ell' \ge \ell$ implies $s_n^{(\ell')} \le s_n^{(\ell)}$, because at least the same number of words are excluded. On
the other hand, we have $s_n^{(\ell')} = s_n^{(\ell)} = s_n$ for $n < 2\ell$. Thus, by excluding squares of longer and
longer words, i.e. by increasing $\ell$, we approach the case of square-free words.
We define corresponding generating functions

$$S^{(\ell)}(x) = \sum_{n=0}^{\infty} s_n^{(\ell)} x^n \tag{8}$$

for the number of ternary length-$\ell$ square-free words. These generating functions are
rational functions of the variable $x$ which can be calculated explicitly, at least for small
values of $\ell$, see [?] where the computation is explained in detail. The first few generating
functions are

$$S^{(0)}(x) = \frac{1}{1 - 3x}, \qquad
S^{(1)}(x) = \frac{1 + x}{1 - 2x}, \qquad
S^{(2)}(x) = \frac{1 + 2x + 2x^2 + 3x^3}{1 - x - x^2},$$

1 + 3x + 6x^2 + 11x^3 + 14x^4 + 20x^5 + 20x^6 + 21x^7 + 12x^8 + 6x^9    (1 - x - x^2 - x^3 - x^4)



We computed the generating functions $S^{(\ell)}(x)$ explicitly for $\ell \le 24$. The functions are
available as Mathematica code [37] at [11]. Note that some generating functions agree;
for instance, $S^{(4)}(x) = S^{(5)}(x)$. The reason is that, going from $\ell = 4$ to $\ell = 5$, no "new"
squares arise; in other words, all squares of square-free words of length 5 already contain
a square of a word of smaller length.
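The rational forms for small $\ell$ can be cross-checked against direct enumeration. The sketch below is an illustration only; the function names are hypothetical and this is not the method used to derive the generating functions. It counts words that avoid squares of words of length at most $\ell$ and compares the counts with the power-series coefficients of $S^{(0)}$, $S^{(1)}$ and $S^{(2)}$.

```python
def length_ell_square_free_counts(ell, max_len):
    """Counts of ternary words of length 0..max_len with no square yy, 1 <= |y| <= ell."""
    def bad_suffix(word):
        n = len(word)
        return any(word[n - 2 * h:n - h] == word[n - h:]
                   for h in range(1, min(ell, n // 2) + 1))

    counts, words = [1], [""]
    for _ in range(max_len):
        words = [w + c for w in words for c in "abc" if not bad_suffix(w + c)]
        counts.append(len(words))
    return counts


def series(num, den, terms):
    """First `terms` Taylor coefficients of num(x)/den(x); lists start at x^0, den[0] == 1."""
    coeffs = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        c -= sum(den[j] * coeffs[n - j]
                 for j in range(1, min(n, len(den) - 1) + 1))
        coeffs.append(c)
    return coeffs


# (numerator, denominator) coefficient lists for ell = 0, 1, 2.
rational = {
    0: ([1], [1, -3]),
    1: ([1, 1], [1, -2]),
    2: ([1, 2, 2, 3], [1, -1, -1]),
}

for ell, (num, den) in rational.items():
    print(ell, series(num, den, 9), length_ell_square_free_counts(ell, 8))
```

For $\ell = 2$ the denominator $1 - x - x^2$ reflects a Fibonacci-type recurrence for the counts from length 4 onwards.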

The radius of convergence $x_c^{(\ell)} \le x_c$ of the series defining the generating function
$S^{(\ell)}(x)$ is determined by a pole in the complex plane located closest to the origin, thus by
a zero of the denominator polynomial of smallest modulus. Due to Pringsheim's theorem
[30, Sec. 7.21], a real and positive such zero exists. Note that the zeros of the numerator
and denominator are mutually exclusive, because they do not contain common polynomial
factors.

The values $x_c^{(\ell)}$ are given in Table 1, together with the degrees $d_{\mathrm{num}}$ and $d_{\mathrm{den}}$ of the
polynomials in the numerator and in the denominator, which both grow with $\ell$. Thus,
with growing length $\ell$, the generating functions $S^{(\ell)}(x)$ have an increasing number of zeros
and poles. The patterns of zeros and poles appear to accumulate in the complex plane
close to the unit circle around the origin; and comparing the patterns for increasing $\ell$,
one might be tempted to conjecture that the poles approach the unit circle
in the limit as $\ell \to \infty$. However, there appear to be some oscillations in the patterns
close to the real line, and at present we do not have any argument why the poles should
accumulate on the unit circle.

The values $x_c^{(\ell)}$ in Table 1 approach $x_c$ from below, so they yield upper bounds on
the exponential growth constant $\mathcal{S} = -\log(x_c)$. The upper bound quoted in equation (5)
above was given in [21] on the basis of an estimate for $x_c^{(23)}$ obtained via the series
expansion of $S^{(23)}(x)$. Our value for $x_c^{(23)}$, based on the complete evaluation of the generating
function $S^{(23)}(x)$, is contained in Table 1; it confirms the bound of Noonan and Zeilberger [21].
The value for $\ell = 24$ slightly improves the upper bound.



Theorem 1. The entropy $\mathcal{S}$ of ternary square-free words is bounded as $\mathcal{S} \le -\log(x_c^{(24)})$,
which gives $\exp(\mathcal{S}) \le 1/x_c^{(24)} \approx 1.30193812$. □
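As a numerical illustration of how a radius of convergence translates into an entropy bound, the sketch below locates the positive zero of the denominators of $S^{(0)}$, $S^{(1)}$ and $S^{(2)}$ by bisection and inverts the tabulated values of $x_c^{(23)}$ and $x_c^{(24)}$. The helper `positive_root` is hypothetical and assumes exactly one sign change on $[0, 1]$, which holds for these three low-degree denominators but is not claimed for the larger ones.

```python
def positive_root(den, lo=0.0, hi=1.0, iterations=100):
    """Bisection for a zero of the polynomial `den` (coefficients from x^0 upwards).

    Assumes den(lo) > 0 > den(hi) and a single sign change in between.
    """
    def p(x):
        return sum(c * x ** i for i, c in enumerate(den))

    assert p(lo) > 0 > p(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if p(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Denominators of S^(0), S^(1), S^(2).
denominators = {0: [1, -3], 1: [1, -2], 2: [1, -1, -1]}
radii = {ell: positive_root(den) for ell, den in denominators.items()}
print(radii)

# Upper bounds on exp(S) from the tabulated radii for ell = 23 and ell = 24.
bound_23 = 1 / 0.768042881
bound_24 = 1 / 0.768085659
print(bound_23, bound_24)
```

The tabulated radii are rounded to nine digits, so the printed reciprocals are only reliable to about eight digits.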
Table 1: Degrees $d_{\mathrm{num}}$ and $d_{\mathrm{den}}$ of the numerator and denominator polynomials of the
generating functions $S^{(\ell)}(x)$, respectively, and the numerical values of the radius of
convergence $x_c^{(\ell)}$.

  $\ell$       $d_{\mathrm{num}}$   $d_{\mathrm{den}}$   $x_c^{(\ell)}$
  0               0        1   0.333333333
  1               1        1   0.500000000
  2               3        2   0.618033989
  3               5        3   0.682327804
  4, 5           13        6   0.724491959
  6, 7           27       15   0.750653202
  8, 9, 10       38       19   0.757826433
  11             81       58   0.762463266
  12            143      106   0.765262611
  13, 14        184      145   0.766784948
  15            209      170   0.767006554
  16, 17        217      178   0.767136379
  18            441      380   0.767542044
  19            644      594   0.767752831
  20            968      890   0.767887486
  21           1003      925   0.767896727
  22           1436     1337   0.767974175
  23           1966     1872   0.768042881
  24           2905     2787   0.768085659



The complete set of poles of the generating function $S^{(24)}(x)$ is shown in Fig. 1. The
pattern looks very similar for other values of $\ell$. This suggests that, in the limit as $\ell$
becomes infinite, which corresponds to the generating function $S(x)$ of ternary
square-free words, the poles accumulate close to the unit circle. This corroborates the conjecture
that $S(x)$ has a natural boundary.



4 Square-free words with fixed letter frequencies 

We now consider the letter statistics of ternary square-free words. Denote the number of
occurrences of the letter $a$ in a ternary square-free word $w_n$ of finite length $n$ by $a(w_n)$.
Clearly, the frequency of the letter $a$ in $w_n$ satisfies $0 \le a(w_n)/n \le 1$. For an infinite ternary
square-free word $w$, letter frequencies do not generally exist. Consider sequences $\{w_n\}$ of
$n$-letter subwords of $w$ containing arbitrarily long words. We define upper and lower frequencies
$f_a^+ \ge f_a^-$ by $f_a^+ := \sup_{\{w_n\}} \limsup_{n\to\infty} a(w_n)/n$ and $f_a^- := \inf_{\{w_n\}} \liminf_{n\to\infty} a(w_n)/n$,
where we take the supremum and infimum over all sequences $\{w_n\}$. We can also compute
these from $a_n^+ = \max_{w_n \subset w} a(w_n)$ and $a_n^- = \min_{w_n \subset w} a(w_n)$ by $f_a^\pm = \lim_{n\to\infty} a_n^\pm / n$, as these
limits exist. This follows, for instance, from the subadditivity of the sequences $\{a_n^+\}$ and
$\{n - a_n^-\}$. If the infinite word $w$ is such that $f_a^+ = f_a^- =: f_a$, we call $f_a$ the frequency of
the letter $a$ in $w$. In general, $f_a^+ > f_a^-$, and letter frequencies do not exist; see also the
discussion below.

Figure 1: Pattern of poles of the generating function $S^{(24)}(x)$ in the complex plane. The poles
(red) accumulate along the unit circle (green). The isolated pole at $x_c^{(24)}$ on the positive real
axis determines the radius of convergence.

However, we can derive bounds on the upper and lower letter frequencies $f^+$ and $f^-$.
Denote the number of ternary square-free words of length $n$ which contain the letter $a$
exactly $k$ times by $s_{n,k}$. Since there are no square-free words of length greater than three
in two letters, a ternary square-free word contains no gaps between letters $a$ of length
greater than three. This implies $s_{n,k} = 0$ for $k < \lfloor n/4 \rfloor$ or $k > \lceil n/2 \rceil$, since the minimal
number of letters $b$ and $c$ is, by the same argument, equal to $\lfloor n/2 \rfloor$. By counting
the number $s_{n,k}$ of ternary square-free words with a given number $k$ of letters $a$, we
can sharpen these bounds. Clearly, for fixed $k$, there are numbers $n_{\min}(k)$ and $n_{\max}(k)$
such that $s_{n,k} = 0$ for $n < n_{\min}(k)$ and $n > n_{\max}(k)$. This means that any ternary
square-free word of length $(m+1)\, n_{\max}(k) \ge n > m\, n_{\max}(k)$, for any integer $m$, contains
at least $mk + 1$ letters $a$, so the frequency of the letter $a$ is bounded from below by
$(mk + 1)/(m\, n_{\max}(k) + 1)$, which becomes $k/n_{\max}(k)$ as $m$ tends to infinity. Similarly, any
word of length $m\, n_{\min}(k) > n \ge (m-1)\, n_{\min}(k)$ contains at most $mk - 1$ letters $a$. Thus we
obtain an upper limit of $(mk - 1)/(m\, n_{\min}(k) - 1)$, which becomes $k/n_{\min}(k)$ as $m$ tends
to infinity. We computed $n_{\max}(k)$ for $k \le 31$ and $n_{\min}(k)$ for $k \le 40$; the strongest bounds
are derived from $n_{\max}(31) = 117$ and $n_{\min}(39) = 97$, which yield lower and upper bounds
$31/117 \approx 0.265$ and $39/97 \approx 0.402$, respectively, for the frequency of a single letter in an
infinite ternary square-free word. This gives

Theorem 2. The upper and lower frequencies $f^\pm$ of a given letter in an infinite ternary
square-free word are bounded by $0.265 \approx 31/117 \le f^- \le f^+ \le 39/97 \approx 0.402$. □
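For very small $k$, the extremal lengths $n_{\min}(k)$ and $n_{\max}(k)$ can be found by exhaustive search, because a square-free word with at most $k$ letters $a$ has bounded length. The sketch below is an illustration with hypothetical names and is not the computation behind the values for $k \le 31$ and $k \le 40$. It performs a depth-first search over square-free words with at most $k$ letters $a$ and also evaluates the two fractions used in Theorem 2.

```python
from fractions import Fraction


def has_square_suffix(word):
    """True if some nonempty square yy is a suffix of `word`."""
    n = len(word)
    return any(word[n - 2 * h:n - h] == word[n - h:] for h in range(1, n // 2 + 1))


def extremal_lengths(k, letter="a"):
    """Return (n_min, n_max) over ternary square-free words with exactly k letters `letter`."""
    n_min = n_max = None
    stack = [("", 0)]
    while stack:
        word, count = stack.pop()
        if count == k:
            n = len(word)
            n_min = n if n_min is None else min(n_min, n)
            n_max = n if n_max is None else max(n_max, n)
        for c in "abc":
            new_count = count + (c == letter)
            # Prune words with too many letters `letter` or containing a square.
            if new_count <= k and not has_square_suffix(word + c):
                stack.append((word + c, new_count))
    return n_min, n_max


for k in range(3):
    print(k, extremal_lengths(k))

lower = Fraction(31, 117)  # k / n_max(k) for k = 31
upper = Fraction(39, 97)   # k / n_min(k) for k = 39
print(float(lower), float(upper))
```

The search terminates because gaps between consecutive letters $a$ have length at most three, so the word length is at most $4k + 3$.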



Remark. In fact, there is a recent, stronger result for the lower frequency [?]. The
minimum frequency $f^-_{\min}$ is bounded from below and above by

$$0.274649 \approx 1780/6481 \le f^-_{\min} \le 64/233 \approx 0.274678 ;$$

compare also similar treatments for binary power-free words [?, 20]. The upper bound
can be sharpened to $f^+ \le 469/1201 \approx 0.390508$ [34].

It is easy to see that the mean letter frequency of any given letter in the set of ternary
square-free words is 1/3. This is a consequence of symmetry under permutation of letters.
Indeed, the symmetric group $S_3$ acts on any square-free word $w$ by permutation of the
three letters, and the set of square-free words of a given length is a disjoint union of
orbits under this action. Each orbit consists of a square-free word and its images under
permutation of letters, and each letter has the same mean frequency on this orbit. So,
for each orbit, the mean frequency of any given letter is 1/3, thus also for the set of all
ternary square-free words of any given length, or indeed for the set of all ternary
square-free words.
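The symmetry argument can be checked numerically on short words. The following sketch (hypothetical helper names) computes the exact mean frequency of each letter over all ternary square-free words of a given length, using rational arithmetic.

```python
from fractions import Fraction


def has_square_suffix(word):
    """True if some nonempty square yy is a suffix of `word`."""
    n = len(word)
    return any(word[n - 2 * h:n - h] == word[n - h:] for h in range(1, n // 2 + 1))


def square_free_words(length):
    """All ternary square-free words of the given length."""
    words = [""]
    for _ in range(length):
        words = [w + c for w in words for c in "abc"
                 if not has_square_suffix(w + c)]
    return words


def mean_frequency(letter, length):
    """Exact mean frequency of `letter` over all square-free words of this length."""
    words = square_free_words(length)
    total = sum(w.count(letter) for w in words)
    return Fraction(total, length * len(words))


for n in range(1, 9):
    print(n, [mean_frequency(c, n) for c in "abc"])
```

Individual words can deviate from 1/3; only the average over the whole permutation-invariant set is forced to equal 1/3.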

We now want to show that there exist ternary square-free words of infinite length with
well-defined letter frequencies for the case $f_a = f_b = f_c = 1/3$ and for some cases where
not all letters are equally frequent. In fact, we are going to prove not just that, but that
there are exponentially many such words, so the growth rate for words of fixed frequencies,
at least for the cases considered below, is positive. This can be done in a similar fashion
to the proofs that the number of ternary square-free words grows exponentially [?].
These proofs are based on Brinkhuis triple pairs [?] and their
generalisations [?]. We briefly sketch the argument here; see [?, 32] for details.

The argument is based on square-free morphisms [?]. Here, we immediately consider
the generalised version of [?, 12]. Assume that we have a set of substitution rules

$$\begin{aligned}
a &\to w_a^{(1)},\ w_a^{(2)},\ \ldots,\ w_a^{(k)} \\
b &\to w_b^{(1)},\ w_b^{(2)},\ \ldots,\ w_b^{(k)} \\
c &\to w_c^{(1)},\ w_c^{(2)},\ \ldots,\ w_c^{(k)}
\end{aligned} \tag{9}$$

where $w_a^{(j)}$, $w_b^{(j)}$ and $w_c^{(j)}$, $1 \le j \le k$, are ternary square-free words of equal length $m$.
Starting from any ternary square-free word $w$ of length $n$, consider the set of all words
of length $mn$ obtained by substituting each letter, choosing independently one of the $k$
words from the lists above. A generalised Brinkhuis triple is defined as a set of substitution
rules such that all these words of length $mn$ are square-free, for any choice of $w$. This
immediately implies that the number of square-free words grows at least as $k^{1/(m-1)}$, see
[12, Lemma 2]. In the case $k = 1$, this reduces to a usual substitution rule without any
freedom; in this case, it only proves existence of infinite words, not exponential growth of
the number of words with length.

In [12], a special class of generalised Brinkhuis triples was considered, and triples up
to length $m = 41$ with $k = 65$ were obtained. This was recently improved to $m = 43$ and
$k = 110$ in [32], yielding the lower bound of (5).

What about the letter frequencies? In general, the words $w_a^{(j)}$ that replace $a$ will have
different letter frequencies, and in this case it is easy to see that not all the infinite words
obtained by repeated substitution will have well-defined letter frequencies. However, we
can say something about letter frequencies if we consider generalised Brinkhuis triples
where all words $w_a^{(j)}$, $1 \le j \le k$, have the same letter frequencies, and analogously for the
words $w_b^{(j)}$, $1 \le j \le k$, and $w_c^{(j)}$, $1 \le j \le k$. In this case, regardless of our choice of words
in the substitution process, we obtain words with well-defined letter frequencies, precisely
as in the case of a standard substitution rule. Denoting the number of letters $a$, $b$ and
$c$ in any of the words $w_a^{(j)}$ by $n^a_a$, $n^b_a$ and $n^c_a$, respectively, with $n^a_a + n^b_a + n^c_a = m$, and
analogously for $w_b^{(j)}$ and $w_c^{(j)}$, we can summarise the letter-counting for the generalised
Brinkhuis triple in a $3 \times 3$ substitution matrix

$$M = \begin{pmatrix} n^a_a & n^a_b & n^a_c \\ n^b_a & n^b_b & n^b_c \\ n^c_a & n^c_b & n^c_c \end{pmatrix} . \tag{10}$$

In general, all entries of this matrix are positive integers, because there are no square-free
words of length $m > 3$ with only two letters. The (right) Perron-Frobenius eigenvector is
thus positive, and its components encode the letter frequencies of the infinite words
obtained by repeated application of the substitution rules. The Perron-Frobenius eigenvalue
is $m$, because $(1, 1, 1)$ is a left eigenvector with eigenvalue $m$.

As mentioned previously, the generalised Brinkhuis triples considered in [12] do not
have the property that the letter frequencies of the substitution words coincide. However,
if we have a generalised Brinkhuis triple, any subset of substitutions also forms a triple,
because all we do is restrict to a subset of words which are still square-free. So by
looking at the triples of [12] and selecting suitable subsets of substitutions, we can use
the same arguments to prove exponential growth of words with fixed letter frequencies.

4.1 Equal letter frequencies 

Let us first consider the case of equal frequencies $f_a = f_b = f_c = 1/3$. We note that
the special Brinkhuis triples of [12] had the additional property that $w_b^{(j)} = \sigma(w_a^{(j)})$ and
$w_c^{(j)} = \sigma^2(w_a^{(j)})$, where $\sigma$ is the permutation of letters defined by $\sigma(a) = b$ and $\sigma(b) = c$.
If we select a subset of the words replacing $a$ such that they have the same numbers of
letters $n^a_a$, $n^b_a$ and $n^c_a$, the substitution matrix for the corresponding triple consisting of
those words and their images under $\sigma$ is

$$M = \begin{pmatrix} n^a_a & n^c_a & n^b_a \\ n^b_a & n^a_a & n^c_a \\ n^c_a & n^b_a & n^a_a \end{pmatrix} , \tag{11}$$

which is circulant, so each of its rows sums to $m$. Hence the right Perron-Frobenius eigenvector is $(1, 1, 1)^t$, and the
letter frequencies are given by $f_a = f_b = f_c = 1/3$.

The simplest example is a Brinkhuis triple with $m = 18$ (see also [21]), which is
explicitly given by

$$\begin{aligned}
w_a^{(1)} &= \texttt{abcacbacabacbcacba} , \\
w_a^{(2)} &= \texttt{abcacbcabacabcacba} = \overline{w}_a^{(1)} ,
\end{aligned} \tag{12}$$

where $\overline{w}_a^{(1)}$ denotes $w_a^{(1)}$ read back-to-front, which thus has the same letter numbers $n^a_a = 7$,
$n^b_a = 5$ and $n^c_a = 6$. So the number of ternary square-free words with letter frequencies
$f_a = f_b = f_c = 1/3$ grows at least as $2^{1/17}$. By looking for the largest subsets of words
with equal letter frequencies in the special Brinkhuis triples of [12], we can improve this
bound. For $m = 41$, we find 30 words $w_a^{(j)}$ with letter numbers $n^a_a = 14$, $n^b_a = 13$ and
$n^c_a = 14$, yielding a lower bound of $30^{1/40} \approx 1.08875$ for the exponential of the entropy.
One of the two triples for $m = 43$ of [32] contains 39 words with $n^a_a = 14$, $n^b_a = 14$ and
$n^c_a = 15$. This gives the following result.

Lemma 1. The entropy $\mathcal{S}(\tfrac13, \tfrac13, \tfrac13)$ of ternary square-free words with letter frequencies
$f_a = f_b = f_c = 1/3$ is bounded from below via $\exp[\mathcal{S}(\tfrac13, \tfrac13, \tfrac13)] \ge 39^{1/42} \approx 1.09115$. □
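The $m = 18$ triple can be spot-checked by machine. The sketch below is a finite sanity check under stated assumptions, not a proof of the Brinkhuis property. It builds the six substitution words from $w_a^{(1)}$, its reversal, and the cyclic letter permutation $\sigma$, substitutes them in every possible way into all square-free words of length up to 3, and tests the images for squares. All function and variable names are hypothetical.

```python
from itertools import product


def is_square_free(word):
    """Exhaustive test: no factor of the form yy with y nonempty."""
    n = len(word)
    return not any(word[i:i + h] == word[i + h:i + 2 * h]
                   for h in range(1, n // 2 + 1)
                   for i in range(n - 2 * h + 1))


def square_free_words(length):
    """All ternary square-free words of the given length (simple filter-and-extend)."""
    words = [""]
    for _ in range(length):
        words = [w + c for w in words for c in "abc" if is_square_free(w + c)]
    return words


SIGMA = str.maketrans("abc", "bca")  # sigma: a -> b, b -> c, c -> a

w1 = "abcacbacabacbcacba"
w2 = w1[::-1]  # w1 read back-to-front
images = {"a": [w1, w2]}
images["b"] = [w.translate(SIGMA) for w in images["a"]]
images["c"] = [w.translate(SIGMA) for w in images["b"]]


def all_substitutions(word):
    """Every image of `word` when each letter independently picks one of its two words."""
    for choice in product(*(images[c] for c in word)):
        yield "".join(choice)


spot_check = all(is_square_free(image)
                 for n in range(1, 4)
                 for w in square_free_words(n)
                 for image in all_substitutions(w))
print(spot_check)

# Growth-rate lower bounds k^(1/(m-1)) quoted in the text.
print(2 ** (1 / 17), 30 ** (1 / 40), 39 ** (1 / 42))
```

Testing short inputs only rules out obvious failures; establishing the property for all square-free inputs requires the finite criteria discussed in the cited literature.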



Remark. This bound can without doubt be improved, because the triples of [12] and
[32] were not optimised to contain the largest number of words of equal frequency.



4.2 Unequal letter frequencies 

What about words with non-equal letter frequencies? The following square-free
substitution rule [?]

$$\begin{aligned}
a &\to \texttt{cacbcabacbab} \\
b &\to \texttt{cabacbcacbab} \\
c &\to \texttt{cbacbcabcbab}
\end{aligned} \tag{13}$$

already shows that infinite words with unequal letter frequencies exist. In this case, the
substitution matrix is

$$M = \begin{pmatrix} 4 & 4 & 3 \\ 4 & 4 & 5 \\ 4 & 4 & 4 \end{pmatrix} , \tag{14}$$

and the right Perron-Frobenius eigenvector with eigenvalue 12 is $(11, 13, 12)^t$. Thus
this substitution leads to a ternary square-free word with letter frequencies $f_a = 11/36$,
$f_b = 13/36$ and $f_c = 1/3$.
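The substitution matrix and its eigenvector follow directly from letter counts. The sketch below, with hypothetical variable names, builds the matrix for this rule with columns indexed by the substituted letter and rows by the counted letter, and verifies the stated eigenvector with integer arithmetic.

```python
from fractions import Fraction

# The three image words of the substitution rule.
rule = {"a": "cacbcabacbab", "b": "cabacbcacbab", "c": "cbacbcabcbab"}
letters = "abc"

# M[row][col] = number of occurrences of letter `row` in the image of letter `col`.
M = [[rule[col].count(row) for col in letters] for row in letters]

v = [11, 13, 12]
Mv = [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

# Normalising the eigenvector gives the letter frequencies.
frequencies = [Fraction(x, sum(v)) for x in v]

print(M)
print(Mv)           # equals 12 * v
print(frequencies)
```

Each column of the matrix sums to the word length 12, which is why $(1, 1, 1)$ is a left eigenvector with eigenvalue 12.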

Can we show that, for some frequencies, there are exponentially many words? Indeed,
for some examples we can find generalised Brinkhuis triples by choosing subsets of those
given in [12]. Here, we restrict ourselves to a few examples.

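Consider the two generating words

$$\begin{aligned}
w_1 &= \texttt{abcbacabacbcabacabcbacbcabcba} \quad (n_a = 10,\ n_b = 10,\ n_c = 9) , \\
w_2 &= \texttt{abcbacabacbcacbacabcacbcabcba} \quad (n_a = 10,\ n_b = 9,\ n_c = 10) ,
\end{aligned} \tag{15}$$

of a Brinkhuis triple with $m = 29$ [12]. Choosing $w_a^{(1)} = w_1$, $w_a^{(2)} = \overline{w}_1$, $w_b^{(1)} = \sigma(w_1)$,
$w_b^{(2)} = \sigma(\overline{w}_1)$, $w_c^{(1)} = \sigma^2(w_2)$ and $w_c^{(2)} = \sigma^2(\overline{w}_2)$, where again $\overline{w}$ denotes the word obtained
by reversing $w$, we obtain a Brinkhuis triple with substitution matrix

$$M = \begin{pmatrix} 10 & 9 & 9 \\ 10 & 10 & 10 \\ 9 & 10 & 10 \end{pmatrix} . \tag{16}$$

The corresponding frequencies are $f = (f_a, f_b, f_c) = (\tfrac{9}{28}, \tfrac{10}{29}, \tfrac{271}{812})$, and the growth rate for
this case is at least $2^{1/28}$.

Consider now two generating words

$$\begin{aligned}
w_1 &= \texttt{abcbacabacbabcabacabcacbcabcba} \quad (n_a = 11,\ n_b = 10,\ n_c = 9) , \\
w_2 &= \texttt{abcbacabacbcabcbacabcacbcabcba} \quad (n_a = 10,\ n_b = 10,\ n_c = 10)
\end{aligned} \tag{17}$$

of a Brinkhuis triple with $m = 30$ [12]. Choosing $w_a^{(1)} = w_1$, $w_a^{(2)} = \overline{w}_1$, $w_b^{(1)} = \sigma(w_2)$,
$w_b^{(2)} = \sigma(\overline{w}_2)$, $w_c^{(1)} = \sigma^2(w_\alpha)$ and $w_c^{(2)} = \sigma^2(\overline{w}_\alpha)$, where $\alpha \in \{1, 2\}$, we obtain two
Brinkhuis triples with substitution matrices $M_\alpha$ given by

$$M_1 = \begin{pmatrix} 11 & 10 & 10 \\ 10 & 10 & 9 \\ 9 & 10 & 11 \end{pmatrix} , \qquad
M_2 = \begin{pmatrix} 11 & 10 & 10 \\ 10 & 10 & 10 \\ 9 & 10 & 10 \end{pmatrix} . \tag{18}$$

The corresponding frequencies now are $f_1 = (\tfrac{290}{841}, \tfrac{271}{841}, \tfrac{280}{841})$ and $f_2 = (\tfrac{10}{29}, \tfrac{1}{3}, \tfrac{28}{87})$, and the
growth rates for these examples are at least $2^{1/29}$.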
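Our next examples use the generating words

$$\begin{aligned}
w_1 &= \texttt{abcacbacabcbabcabacbcabcbacbcacba} \quad (n_a = 11,\ n_b = 11,\ n_c = 11) \\
w_2 &= \texttt{abcacbcabacabcacbabcbacabacbcacba} \quad (n_a = 12,\ n_b = 10,\ n_c = 11)
\end{aligned} \tag{19}$$

of a Brinkhuis triple with $m = 33$ [12]. Choosing as above $w_a^{(1)} = w_1$, $w_a^{(2)} = \overline{w}_1$,
$w_b^{(1)} = \sigma(w_2)$, $w_b^{(2)} = \sigma(\overline{w}_2)$, $w_c^{(1)} = \sigma^2(w_\alpha)$ and $w_c^{(2)} = \sigma^2(\overline{w}_\alpha)$, where $\alpha \in \{1, 2\}$, we
obtain two Brinkhuis triples, this time with substitution matrices $M_\alpha$ given by

$$M_1 = \begin{pmatrix} 11 & 11 & 11 \\ 11 & 12 & 11 \\ 11 & 10 & 11 \end{pmatrix} , \qquad
M_2 = \begin{pmatrix} 11 & 11 & 10 \\ 11 & 12 & 11 \\ 11 & 10 & 12 \end{pmatrix} . \tag{20}$$

The corresponding frequencies now are $f_1 = (\tfrac{1}{3}, \tfrac{11}{32}, \tfrac{31}{96})$ and $f_2 = (\tfrac{331}{1024}, \tfrac{11}{32}, \tfrac{341}{1024})$. Here, the
growth rate is at least $2^{1/32}$.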

Finally, we give one example with a rather large deviation from equidistribution of
letters. This uses three generating words

$$\begin{aligned}
w_1 &= \texttt{abcacbacabacbcabacabcacbcabacbcacba} \quad (n_a = 13,\ n_b = 10,\ n_c = 12) , \\
w_2 &= \texttt{abcacbcabacbabcbacabcbabcabacbcacba} \quad (n_a = 12,\ n_b = 12,\ n_c = 11) , \\
w_3 &= \texttt{abcacbacabacbcabacabcbabcabacbcacba} \quad (n_a = 13,\ n_b = 11,\ n_c = 11) ,
\end{aligned} \tag{21}$$

of a Brinkhuis triple with $m = 35$ [12]. Choosing $w_a^{(1)} = w_1$, $w_a^{(2)} = \overline{w}_1$, $w_b^{(1)} = \sigma(w_2)$,
$w_b^{(2)} = \sigma(\overline{w}_2)$, $w_c^{(1)} = \sigma^2(w_3)$ and $w_c^{(2)} = \sigma^2(\overline{w}_3)$, we obtain a Brinkhuis triple with
substitution matrix

$$M = \begin{pmatrix} 13 & 11 & 11 \\ 10 & 12 & 11 \\ 12 & 12 & 13 \end{pmatrix} , \tag{22}$$

which yields frequencies $f = (\tfrac{1}{3}, \tfrac{16}{51}, \tfrac{6}{17})$. The growth rate is at least $2^{1/34}$.
To summarise, we proved the following. 

Lemma 2. The entropy of ternary square-free words with fixed letter frequency $f_a$ is
strictly positive for $f_a \in \{ \tfrac{16}{51}, \tfrac{9}{28}, \tfrac{28}{87}, \tfrac{271}{841}, \tfrac{31}{96}, \tfrac{331}{1024}, \tfrac{280}{841}, \tfrac{341}{1024}, \tfrac{1}{3}, \tfrac{271}{812}, \tfrac{11}{32}, \tfrac{10}{29}, \tfrac{6}{17} \}$. □
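The frequencies in the examples of this section are the normalised right Perron-Frobenius eigenvectors of the substitution matrices. As a hedged cross-check, the sketch below solves $(M - m\,\mathbb{1})\,v = 0$ with $v_a + v_b + v_c = 1$ in exact rational arithmetic for the matrices with $m = 29, 30, 33, 35$. The solver is a hypothetical helper, and the last rows of the two $m = 33$ matrices are taken here from the requirement that every column sums to 33.

```python
from fractions import Fraction


def perron_frequencies(M):
    """Normalised right eigenvector of M for the eigenvalue m = common column sum."""
    m = sum(row[0] for row in M)
    # Two rows of (M - m*I) plus the normalisation v_a + v_b + v_c = 1.
    A = [[Fraction(M[i][j] - (m if i == j else 0)) for j in range(3)]
         for i in range(2)]
    A.append([Fraction(1)] * 3)
    b = [Fraction(0), Fraction(0), Fraction(1)]
    # Gauss-Jordan elimination with exact rationals.
    for col in range(3):
        pivot = next(r for r in range(col, 3) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return tuple(b[i] / A[i][i] for i in range(3))


matrices = {
    "m=29": [[10, 9, 9], [10, 10, 10], [9, 10, 10]],
    "m=30, alpha=1": [[11, 10, 10], [10, 10, 9], [9, 10, 11]],
    "m=30, alpha=2": [[11, 10, 10], [10, 10, 10], [9, 10, 10]],
    "m=33, alpha=1": [[11, 11, 11], [11, 12, 11], [11, 10, 11]],
    "m=33, alpha=2": [[11, 11, 10], [11, 12, 11], [11, 10, 12]],
    "m=35": [[13, 11, 11], [10, 12, 11], [12, 12, 13]],
}

all_values = set()
for name, M in matrices.items():
    f = perron_frequencies(M)
    all_values.update(f)
    print(name, [str(x) for x in f])

print(sorted(all_values))
```

By permutation symmetry of the letters, every component of each frequency vector is an admissible value of $f_a$, which is how a handful of triples produces a whole list of frequencies.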

One should expect that the entropy is strictly positive for all letter frequencies $f_a$
in an interval. However, it is not straightforward to show that by using substitutions
of Brinkhuis triples with different letter frequencies. The reason is that, in general, the
infinite words obtained by such substitutions do not have well-defined letter frequencies.

In the following sections, we are going to use methods from the theory of generating
functions and convex analysis which are often applied in the context of statistical
mechanics [17]. The free energy of square-free words, which we will define below, is related
to the entropy function of square-free words with fixed letter density, as follows from
Proposition 2. An immediate consequence of the concavity of the entropy function is that
the entropy is strictly positive for all frequencies $f_a \in (16/51, 6/17) \approx (0.3137, 0.3529)$,
see below.



5 Free energy 

Since the language of square-free words is subword-closed, the numbers $s_{n,k}$ satisfy the
submultiplicative inequality

$$s_{n+m,k} \le \sum_{l=0}^{k} s_{n,l} \, s_{m,k-l} . \tag{23}$$

Consider the functions $s_n(q)$ defined by $s_n(q) = \sum_{k=0}^{n} s_{n,k} \, q^k$. These are polynomials in $q$
of degree not larger than $n$. The submultiplicative inequality (23) implies for the functions
$s_n(q)$ that $s_{n+m}(q) \le s_n(q) \, s_m(q)$ for $0 < q < \infty$. We are interested in the exponential
growth rate of $s_n(q)$. To this end, define $F_n(q) := \frac{1}{n} \log s_n(q)$. The submultiplicative
inequality yields [17, Lemma A.1] that the limit $F(q) := \lim_{n\to\infty} F_n(q)$ exists, and that
$F(q) < \infty$ for $0 < q < \infty$. The function $F(q)$ is called the free energy of the model. More
can be said about the properties of the free energy by using convexity arguments. These
are largely independent of the underlying combinatorial model and are discussed in detail
in [13, Sec. 2.1, App. B]. We obtain

Proposition 1. The functions $F_n(q) = \frac{1}{n} \log s_n(q)$ of ternary square-free words are
continuous, analytic and convex in $\log q$ in $(0, \infty)$. The free energy $F(q)$ of ternary
square-free words

$$F(q) = \lim_{n\to\infty} F_n(q) \tag{24}$$

exists and satisfies $F(q) < \infty$ for $q \in (0, \infty)$. Moreover, it is a convex function of $\log q$
for $q \in (0, \infty)$. If $F(q)$ is finite, its right- and left-derivatives exist everywhere in $(0, \infty)$,
and they are non-decreasing functions of $q$. The function $F(q)$ is differentiable almost
everywhere, and wherever the derivative $dF(q)/dq$ exists, it is given by $\lim_{n\to\infty} dF_n(q)/dq$.

□
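To make the finite-size objects concrete, the sketch below tabulates $s_{n,k}$ for short words, evaluates the polynomials $s_n(q)$ and the functions $F_n(q)$, and checks the submultiplicative inequality for $s_n(q)$ at a few sample values of $q$. Function names are hypothetical, and the lengths used are far too small to say anything about the limit $F(q)$.

```python
import math


def has_square_suffix(word):
    """True if some nonempty square yy is a suffix of `word`."""
    n = len(word)
    return any(word[n - 2 * h:n - h] == word[n - h:] for h in range(1, n // 2 + 1))


def letter_count_table(max_len, letter="a"):
    """table[n][k] = number of ternary square-free words of length n with k letters `letter`."""
    table, words = [[1]], [""]
    for n in range(1, max_len + 1):
        words = [w + c for w in words for c in "abc"
                 if not has_square_suffix(w + c)]
        row = [0] * (n + 1)
        for w in words:
            row[w.count(letter)] += 1
        table.append(row)
    return table


def s_poly(table, n, q):
    """Evaluate s_n(q) = sum_k s_{n,k} q^k."""
    return sum(c * q ** k for k, c in enumerate(table[n]))


def F_n(table, n, q):
    """Finite-size free energy F_n(q) = (1/n) log s_n(q)."""
    return math.log(s_poly(table, n, q)) / n


table = letter_count_table(12)
for q in (0.5, 1.0, 2.0):
    ok = all(s_poly(table, n + m, q) <= s_poly(table, n, q) * s_poly(table, m, q) * (1 + 1e-12)
             for n in range(1, 7) for m in range(1, 7))
    print(q, ok, F_n(table, 12, q))
```

At $q = 1$ the polynomial $s_n(q)$ reduces to $s_n$, so $F_n(1)$ is the finite-size estimate of the entropy.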

In the following, we will apply the results of the preceding section in order to derive
bounds on the free energy. This will show that the free energy $F(q)$ is finite for $0 < q < \infty$.
Using the above substitution rule (13) and the substitution rule given in [?], we
derive a lower bound on the free energy.

Lemma 3. The free energy $F(q)$ is bounded from below by

$$F(q) \ge \max\left\{ \frac{64}{233} \log q ,\ \frac{13}{36} \log q \right\} . \tag{25}$$

Proof. Consider ternary square-free words $w_n$ of length $n = 12^k$, where $k \in \mathbb{N}$, generated
by the substitution rule (13), with $w_1 = c$. Define $k^+(n) = 13n/36 + \delta^+(n)$, which
denotes the number of letters of type $a$ in $w_n$. Note that $\delta^+(n) = o(n)$. We have $s_n(q) \ge
s_{n,k^+(n)} \, q^{k^+(n)}$. Taking the logarithm, dividing by $n$ and performing the limit leads to
$F(q) \ge \frac{13}{36} \log q$. The second part of the statement follows by the same argument with the
substitution rule given in [?]. □

Remark. A weaker bound with 64/233 replaced by $11/36 > 64/233$ may be derived
using the substitution (13), where the roles of $a$ and $b$ are interchanged.

We now turn to the question of an upper bound, which can be analysed using the
bounds for letter frequencies obtained in [?] or in Theorem 2.

Lemma 4. The free energy $F(q)$ of ternary square-free words is bounded from above by

$$F(q) \le -\log x_c + \max\left\{ \frac{1780}{6481} \log q ,\ \frac{469}{1201} \log q \right\} , \tag{26}$$

where $x_c = \lim_{n\to\infty} s_n^{-1/n} \approx 0.768189$ is the critical point of ternary square-free words.
Proof. Assume that $q \ne 1$. (The case $q = 1$ has been discussed in Section 2, where
$F(1) = -\log x_c$ was proven.) Assume that $B_n$ and $A_n$ are numbers such that $s_{n,k} = 0$ for
$k > B_n$ or $k < A_n$, $s_{n,B_n} > 0$, and $s_{n,A_n} > 0$. For $1 \ne q \in (0, \infty)$ we have the estimate

$$s_n(q) \le s_n \sum_{k=A_n}^{B_n} q^k = s_n \, \frac{q^{B_n + 1} - q^{A_n}}{q - 1} . \tag{27}$$

Assume that $q > 1$. Taking the logarithm, dividing by $n$ and performing the limit
$n \to \infty$, this implies $F(q) \le -\log x_c + \epsilon_+ \log q$, where $\epsilon_+ = \limsup_{n\to\infty} B_n/n$. Note that
$\epsilon_+ \le 469/1201$, as follows from the bound given in [34]. A similar argument holds for
$q < 1$, involving the lower bound $A_n$. From [?] we get the bound 1780/6481. Combining
the two results, we get the inequality (26). □

Remark. A weaker bound with (1780/6481, 469/1201) replaced by (31/117, 39/97)
follows from Theorem 2.

Define the two-variable generating function $S(x, q)$

$$S(x, q) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} s_{n,k} \, q^k x^n = \sum_{n=0}^{\infty} s_n(q) \, x^n . \tag{28}$$

Denote the radius of convergence of $S(x, q)$ by $x_c(q)$. The curve $x_c(q)$ is called the critical
curve, and the plot of $x_c(q)$ in the $xq$-plane is called the phase diagram of the model. The
free energy is related to the critical curve by

$$x_c(q)^{-1} = \lim_{n\to\infty} s_n(q)^{1/n} = e^{F(q)} . \tag{29}$$

We set $x_c = x_c(1)$ for the critical point of ternary square-free words. Bounds on the curve
$x_c(q)$ can be derived from bounds on the free energy $F(q)$ as given above. This yields

$$x_c \min\{ q^{-1780/6481}, q^{-469/1201} \} \le x_c(q) \le \min\{ q^{-64/233}, q^{-13/36} \} . \tag{30}$$

The phase diagram is shown in Fig. 2. Using the series data from exact enumeration for
length $n \le 100$, we extrapolated the values of $x_c(q)$ for different values of $q$, using first
order differential approximants [?]. The critical curve $x_c(q)$ is, within the analysed range
of $q$, very close to the curve $x_c \, q^{-1/3}$, reflecting the fact that the values $k = k(n)$ where
$s_{n,k} \ne 0$ are sharply concentrated around $k = \lfloor n/3 \rfloor$. For large values of $q$, such a form is,
however, not compatible with the derived bounds on $x_c(q)$. Numerical analysis suggests
that the leading divergence of $S(x, q)$ is a simple pole, which is approached uniformly
in $x$ and $q$. Thus, there is no indication that the nature of the singularity changes, in
contrast to other examples from statistical mechanics, where such a change indicates a
phase transition [17].
Figure 2: Phase diagram of ternary square-free words, as extrapolated from exact enumeration
data (circles). Upper and lower bounds on $x_c(q)$ are drawn for comparison.

6 Entropy and symmetry 

We now address the question of the number of ternary square-free words, where we fix
the frequency of letters of type $a$. We consider the number of square-free words $s_{n,\lfloor \epsilon n \rfloor}$
in $n$ letters with $\lfloor \epsilon n \rfloor$ occurrences of the letter $a$. The number $\epsilon$ may thus be regarded
as the frequency of the letter $a$. We are interested in the exponential growth rate of
$s_{n,\lfloor \epsilon n \rfloor}$. This leads to the question whether sequences of the form $\frac{1}{n} \log s_{n,\lfloor \epsilon n \rfloor}$ have a limit
as $n \to \infty$, which we then call the entropy function $P(\epsilon)$. It is related to the free energy $F(q)$
by a Legendre-Fenchel transform, as we will now show.

Note that there is a constant $K > 0$ such that $0 \le s_{n,k} \le K^n$ for each value of $n$ and
$k$. This follows from the existence of the entropy $\mathcal{S}$ of ternary square-free words. Note also
that there exists a finite constant $C > 0$, and numbers $A_n$ and $B_n$ such that $s_{n,A_n} > 0$ and
$s_{n,B_n} > 0$, and $s_{n,k} > 0$ when $0 \le A_n \le k \le B_n \le Cn$. This follows from the substitution
rule (13). Take $A_n$ and $B_n$ such that $s_{n,k} = 0$ if $k < A_n$ or $k > B_n$. Define the numbers

$$\epsilon_+ = \limsup_{n\to\infty} \frac{B_n}{n} , \qquad \epsilon_- = \liminf_{n\to\infty} \frac{A_n}{n} . \tag{31}$$

From [?] and the substitution rule (13), we have $0.361 \approx 13/36 \le \epsilon_+ \le 469/1201 \approx 0.391$
and $0.274649 \approx 1780/6481 \le \epsilon_- \le 64/233 \approx 0.274678$. Thus, the assumptions in
[17, Thm. 3.19] are satisfied, and we obtain
Proposition 2. The entropy function $P(\epsilon)$ of ternary square-free words exists in $(\epsilon_-, \epsilon_+)$
and is defined by

$$P(\epsilon) = \inf_{0<q<\infty} \{ F(q) - \epsilon \log q \}. \tag{32}$$

Moreover, there is a sequence of integers $\{\sigma_n\}$ such that $\sigma_n = o(n)$ and the limit

$$P(\epsilon) = \lim_{n\to\infty} \frac{1}{n} \log s_{n,\lfloor \epsilon n \rfloor + \sigma_n} \tag{33}$$

exists and is finite and concave in $(\epsilon_-, \epsilon_+)$. Lastly, note also that $\delta_n = \lfloor \epsilon n \rfloor + \sigma_n$ is the
least value of $k$ that maximises $s_{n,k}\, q^k$, where $q$ is the value at which the infimum in (32)
is attained. □
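
The Legendre-Fenchel transform (32) can be explored numerically at finite word length. The sketch below is an illustration under stated assumptions, not part of the original analysis: it replaces $F(q)$ by the finite-$n$ proxy $F_n(q) = \frac{1}{n} \log \sum_k s_{n,k}\, q^k$ and replaces the infimum by a minimum over a grid in $\log q$. The word length `N`, the grid and all function names are hypothetical choices.

```python
import math
from collections import Counter

def has_square_suffix(word):
    """True if `word` ends in a square yy with y non-empty."""
    n = len(word)
    return any(word[n - h:] == word[n - 2 * h:n - h] for h in range(1, n // 2 + 1))

def letter_counts(n, alphabet="abc"):
    """Counter k -> s_{n,k}: ternary square-free words of length n with k letters a."""
    words = [""]
    for _ in range(n):
        # Extending square-free words only requires checking the new suffix.
        words = [w + x for w in words for x in alphabet if not has_square_suffix(w + x)]
    return Counter(w.count("a") for w in words)

def free_energy(s_nk, n, log_q):
    """Finite-n proxy F_n(q) = (1/n) log sum_k s_{n,k} q^k at q = exp(log_q)."""
    return math.log(sum(c * math.exp(k * log_q) for k, c in s_nk.items())) / n

def entropy_estimate(s_nk, n, eps, log_q_grid):
    """Grid minimum of F_n(q) - eps * log q, a finite-n proxy for equation (32)."""
    return min(free_energy(s_nk, n, t) - eps * t for t in log_q_grid)

N = 12                                            # hypothetical small word length
LOG_Q_GRID = [i / 100 for i in range(-300, 301)]  # hypothetical grid containing log q = 0
S_NK = letter_counts(N)

if __name__ == "__main__":
    for eps in (0.30, 1 / 3, 0.36):
        value = entropy_estimate(S_NK, N, eps, LOG_Q_GRID)
        print(f"eps = {eps:.3f}   P_N(eps) ~ {value:.4f}")
```

For finite $n$ the true infimum is only finite when $\epsilon n$ lies between the smallest and largest $k$ with $s_{n,k} > 0$, so the grid minimum is meaningful only for frequencies well inside that range.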



Remark. Together with Lemma 2, an immediate consequence of the concavity of the entropy
function is that the entropy is strictly positive for all frequencies $\epsilon \in (16/51, 6/17) \approx
(0.3137, 0.3529)$.
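
The decimal approximations quoted in this section can be checked with exact rational arithmetic. The short Python sketch below does only that; the names are hypothetical, and the consistency check merely confirms that the interval of the Remark contains $1/3$ and lies between the quoted upper bound on $\epsilon_-$ and the quoted lower bound on $\epsilon_+$.

```python
from fractions import Fraction

# (rational value, decimal quoted in the text, number of decimals quoted)
QUOTED = [
    (Fraction(13, 36), 0.361, 3),
    (Fraction(469, 1201), 0.391, 3),
    (Fraction(1780, 6481), 0.274649, 6),
    (Fraction(64, 233), 0.274678, 6),
    (Fraction(16, 51), 0.3137, 4),
    (Fraction(6, 17), 0.3529, 4),
]

def decimals_agree(value, quoted, digits):
    """True if `value`, rounded to `digits` decimals, equals the quoted decimal."""
    return round(float(value), digits) == quoted

def positivity_interval_is_consistent():
    """Check 64/233 < 16/51 < 1/3 < 6/17 < 13/36 exactly."""
    low, high = Fraction(16, 51), Fraction(6, 17)
    return Fraction(64, 233) < low < Fraction(1, 3) < high < Fraction(13, 36)

if __name__ == "__main__":
    for value, quoted, digits in QUOTED:
        print(value, float(value), decimals_agree(value, quoted, digits))
    print("interval consistent:", positivity_interval_is_consistent())
```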

We consider now the question where the entropy function takes its maximum. To 
this end, we assume a special regularity condition on the free energy, whose validity is 
supported by the numerical analysis of the preceding section, see also the discussion in 
the conclusion. 

Lemma 5. Let $\epsilon \in (\epsilon_-, \epsilon_+)$. If $F(q) \in C^2(0, \infty)$, and if $F(q)$ is strictly convex in $\log q$,
we have $P(\epsilon) \in C^2(\epsilon_-, \epsilon_+)$ for the entropy function, and it is given by

$$P(\epsilon) = F(q(\epsilon)) - \epsilon \log q(\epsilon), \tag{34}$$

where $q(\epsilon)$ is the unique positive solution of

$$\epsilon = q \frac{d}{dq} F(q). \tag{35}$$

The entropy function $P(\epsilon)$ attains its global maximum at $q = 1$.

Proof. Since $F(q)$ is convex in $\log q$ and continuous, and $F(q) \ge \max\{\epsilon_- \log q, \epsilon_+ \log q\}$,
the infimum in (32) occurs at a unique value $q = q(\epsilon) \in (0, \infty)$. Since $F(q) \in C^1(0, \infty)$,
we obtain $\epsilon = q F'(q) = \frac{d}{d \log q} F(q)$ as an implicit equation for $q(\epsilon)$. This uniquely
defines a positive function $q = q(\epsilon) \in C^1(\epsilon_-, \epsilon_+)$, since strict convexity of $F(q)$ and
$F(q) \in C^2(0, \infty)$ implies $\frac{d^2}{d(\log q)^2} F(q) \neq 0$. We have explicitly $P'(\epsilon) = -\log q(\epsilon)$, which
shows that $P(\epsilon) \in C^2(\epsilon_-, \epsilon_+)$, and $-\infty < P''(\epsilon) = -\left( \frac{d^2}{d(\log q)^2} F(q) \right)^{-1} < 0$. This implies
that $q = 1$ is a local maximum of $P(\epsilon)$. Due to the concavity of $P(\epsilon)$, it is the global
maximum. □

We note that at $q = 1$, the letter density $\epsilon = F'(1)$ is the mean letter density, which
was determined above to be $\epsilon = 1/3$ by a symmetry argument. Thus, under the above
regularity assumption, maximum entropy occurs at equal (mean) letter density $\epsilon_a = \epsilon_b =
\epsilon_c = 1/3$. This is an example of the more general result that maximum entropy occurs
at points of maximum symmetry. The concept of symmetry and its implications
for the free energy and entropy have been discussed for the combinatorial problem of random
tilings, and the same reasoning is applicable in this case.
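
The symmetry argument can be illustrated at finite length. For the finite-$n$ proxy $F_n(q) = \frac{1}{n} \log \sum_k s_{n,k}\, q^k$ (an assumption of this sketch, not a quantity defined in the text), the derivative at $q = 1$ equals the mean density of the letter $a$ over all square-free words of length $n$. The following Python sketch, with hypothetical function names, computes this mean exactly for each letter.

```python
from fractions import Fraction

def has_square_suffix(word):
    """True if `word` ends in a square yy with y non-empty."""
    n = len(word)
    return any(word[n - h:] == word[n - 2 * h:n - h] for h in range(1, n // 2 + 1))

def square_free_words(n, alphabet="abc"):
    """List all ternary square-free words of length n by letter-wise extension."""
    words = [""]
    for _ in range(n):
        words = [w + x for w in words for x in alphabet if not has_square_suffix(w + x)]
    return words

def mean_density(n, letter="a"):
    """Exact mean frequency of `letter` over all square-free words of length n >= 1."""
    words = square_free_words(n)
    return Fraction(sum(w.count(letter) for w in words), n * len(words))

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, [mean_density(n, x) for x in "abc"])
```

Because permuting the letters maps the set of square-free words of length $n$ onto itself, each letter carries the same share of the $n$ positions, which is what the exact fractions reflect.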
7 Conclusions



In this article, we considered the growth rate, or the entropy, of the set of ternary square-free
words. By computing generating functions $S^{(\ell)}(x)$ for length-$\ell$ square-free words,
where the condition of square-freeness is truncated at length $\ell$, we verified a previously
proposed upper bound and slightly improved it. The pattern of poles of these generating
functions, and their behaviour as $\ell$ increases, points towards a natural boundary for the
generating function $S(x)$.

The presence of a natural boundary in a model indicates that it cannot be solved ex- 
actly in terms of standard functions of mathematical physics, which obey linear differential 
equations with polynomial coefficients [TSj. This would exclude, for ternary square-free 
words, an exact value for the entropy and the functional form of the free energy. It 
may even be difficult to prove the existence of a critical exponent, compare the related 
self-avoiding walk problem [TTj . 

In the ternary alphabet, no letter is preferred by the condition of square-freeness. Thus,
averaging over the entire set of ternary square-free words, all letters appear equally often.
In a single infinite word, however, this need not be the case; indeed, the letter frequency
may not be well-defined. Nevertheless, one can derive limits on the minimum or maximum
frequency of a given letter in an infinite ternary square-free word, and by explicitly
constructing infinite words with given well-defined frequencies by means of substitution
rules, the minimum and maximum frequency can be bounded from above and below. We
obtained limits from counting square-free words up to a certain length; sharper limits
were given recently in [331 EI]. The bounds for the maximum frequency can certainly be
further improved employing the approach of [T§J IHH].

Lower bounds on the entropy are based on Brinkhuis triples and their generalisations. 
We used these to prove that, for a list of rational values, the entropy of the set of square- 
free words with a fixed letter frequency is strictly positive. Together with the concavity of 
the entropy function, obtained by methods of convex analysis and statistical mechanics, 
this led to the result that the entropy is strictly positive on an entire interval. 

Concerning the entropy function, it would be interesting to extend the interval of
strict positivity by providing sharper bounds from suitable substitution rules. This might
be achievable by following and suitably modifying the approach taken in [T^J 1201 E31. It
is conceivable, albeit not necessary, that there exists a region of frequencies for which
infinite square-free words exist, but the entropy vanishes, because the number of square-free
words with that given letter frequency grows sub-exponentially. Such behaviour has
been reported for $k$th-power-free binary words with rational powers in the
range $2 < k \le 7/3$ [T3|.

Further, it remains to prove the validity of the regularity assumption on the free
energy in Lemma 5. In contrast to other problems in statistical mechanics [17], there is
no indication of a phase transition in the model of ternary square-free words, so
an analytic free energy is expected.

It would also be interesting to analyse the letter distribution using probabilistic methods.
Similar examples lead, in an appropriate scaling limit, to Gaussian distribution
functions [23].